-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
helm: add rate limit env variable to launch workflow #619
Conversation
Codecov Report
@@ Coverage Diff @@
## master #619 +/- ##
=======================================
Coverage 19.34% 19.34%
=======================================
Files 26 26
Lines 2078 2078
=======================================
Hits 402 402
Misses 1676 1676 |
caafe8d
to
b2c6c31
Compare
helm/reana/README.md
Outdated
@@ -20,6 +20,7 @@ This Helm automatically prefixes all names using the release name to avoid colli | |||
| `components.reana_server.environment.REANA_WORKFLOW_SCHEDULING_POLICY` | Define workflow scheduling strategy. Options are "fifo" for first-in-first-out strategy regardless of users and "balanced" for multi-user-aware scheduling strategy. | "fifo" | | |||
| `components.reana_server.environment.REANA_RATELIMIT_GUEST_USER` | Set API limiter config for guest users. Users using reana-client will be treated as guests. | "20 per second" | | |||
| `components.reana_server.environment.REANA_RATELIMIT_AUTHENTICATED_USER` | Set API limiter config for authenticated web UI users. | "20 per second" | | |||
| `components.reana_server.environment.REANA_RATELIMIT_LAUNCH_WORKFLOW` | Set API limiter config for launch workflow endpoint. | "1/30 second" | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested the PR set, works nicely, just a couple of questions:
-
Perhaps it's worth to go for a more generic "fast" and a "slow" endpoint solution as described in the issue? Creating a separate envar only for
launch
endpoint seems to be too specific. WDYT? cc @tiborsimko -
"1/30 second" rate limits seems to be too strict for me? Perhaps we should relax it a bit to something like "1/5 second"
-
I haven't looked deeply, but would it be possible to return more user friendly error messages if rate limit is not respected? For example if user hits
Launch on REANA
web page twice in 30 seconds, this is what will be displayed:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
"1/5 second" means "1 per 5 seconds" which is stricter than "1/30 second" which means "1 per 30 seconds". We can make it "1/15 second" which will mean "1 per 15 seconds".
-
Good point. There are two ways how to make it nice: change UI or change backend. With the backend, I think, it is possible with
limiter
andflask-limiter
libraries. But, we rely oninvenio-app
so I will need to check if they allow it somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"1/30 second" allows a user to launch a workflow only once per 30 seconds
while "1/5 second" allows to do it every 5 seconds, seems less strict to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ops. You are right. My brain is not working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WRT 1 I agree with @audrium, as we'll soon have other possibly "slow" endpoints, such as for validation of docker images that would need fetching, spawning pods, etc. Could create DoS kind of workload if someone overdoes those. So we could have the generic REANA_RATELIMIT_GUEST_USER
and REANA_RATELIMIT_AUTHENTICATED_USER
variables for default "fast" endpoints, as up to now, and we could introduce a new REANA_RATELIMIT_SLOW
variable for "slow" endpoints that need to be protected, and start to applying them e.g. to workflow launcher, to notebook launcher, etc.
We could actually proactively introduce three values already, *_FAST
, *_MEDIUM
, *_SLOW
if we want, and start tagging other endpoints as well. For example, for ATLAS pMSSM use case, where there are many concurrent workflows running, one may want to go into 1000 per second
for some endpoints. So we could start already tagging via fast/medium/slow those endpoints which are needed for internal system working (and need to be ultra permissive), which are just for regular user consumption wondering about status (and which are medium default), which might be too time consuming (and need slow protection).
(As for naming, fast/medium/slow might be enough, we could introduce "rocket" or something later if need be. Or we could invent some fancy naming scheme such as cheetah/rabbit/ant/snail that would also be nicely extensible in the future if there will be a need for more fine-grained categorisation than just fast/slow.).
WRT 2 allowing 1 launch each 5 seconds seems reasonable.
WRT 3 prettifying the output message would be a nice bonus. See also #286 how the rate limiter error message looks like in regular reana-client
usage scenario if user has many input files. Currently it is not easily understandable by users. So improving it for both Web UI users and CLI UI users would be really nice to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Changed to
REANA_RATELIMIT_SLOW
. I think we can doFAST
andMEDIUM
later when it is needed. I will create an issue. -
Change to 1/5
-
Can be a separate issue to improve Web and CLI messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created reanahub/reana-server#455 to address the 3rd point
a9e58d4
to
6732776
Compare
helm/reana/README.md
Outdated
@@ -20,6 +20,7 @@ This Helm automatically prefixes all names using the release name to avoid colli | |||
| `components.reana_server.environment.REANA_WORKFLOW_SCHEDULING_POLICY` | Define workflow scheduling strategy. Options are "fifo" for first-in-first-out strategy regardless of users and "balanced" for multi-user-aware scheduling strategy. | "fifo" | | |||
| `components.reana_server.environment.REANA_RATELIMIT_GUEST_USER` | Set API limiter config for guest users. Users using reana-client will be treated as guests. | "20 per second" | | |||
| `components.reana_server.environment.REANA_RATELIMIT_AUTHENTICATED_USER` | Set API limiter config for authenticated web UI users. | "20 per second" | | |||
| `components.reana_server.environment.REANA_RATELIMIT_SLOW` | Set API limit config for slow endpoints. | "1/5 second" | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could enhance the message for the user to give a bit more context:
Set API limiter config for slow endpoints that need to be protected e.g. launch endpoint
or something similar, WDYT?
Also, we could add a similar message to the changelog
6732776
to
52010f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works nicely!
closes reanahub/reana-server#443